
    Clustering in Hypergraphs to Minimize Average Edge Service Time

    We study the problem of clustering the vertices of a weighted hypergraph such that, on average, the vertices of each edge can be covered by a small number of clusters. This problem has many applications, such as designing medical tests, clustering files on disk servers, and placing network services on servers. The edges of the hypergraph model groups of items that are likely to be needed together, and the optimization criterion we use can be interpreted as the average delay (or cost) to serve the items of a typical edge. We describe and analyze algorithms for this problem for the case in which the clusters have to be disjoint and for the case where clusters can overlap. The analysis is often subtle and reveals interesting structure and invariants that one can utilize.
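The objective described in this abstract can be illustrated for the disjoint-cluster case with a minimal sketch. It assumes the cost of an edge is the number of distinct clusters its vertices fall into and that the average is weighted by edge weight; the function name `average_edge_cost` and the sample data are hypothetical, and the paper's exact objective may differ.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

Edge = Tuple[Iterable[Hashable], float]  # (vertices of the hyperedge, edge weight)


def average_edge_cost(edges: List[Edge], cluster_of: Dict[Hashable, int]) -> float:
    """Weighted average number of disjoint clusters needed to cover each edge."""
    total_weight = sum(weight for _, weight in edges)
    if total_weight <= 0:
        raise ValueError("edges must have positive total weight")
    # An edge is covered by exactly the clusters that contain its vertices.
    cost = sum(
        weight * len({cluster_of[v] for v in vertices})
        for vertices, weight in edges
    )
    return cost / total_weight


# Hypothetical instance: four items, two clusters, three weighted edges.
example_edges: List[Edge] = [
    (("a", "b"), 2.0),       # both items in cluster 0, cost 1
    (("b", "c"), 1.0),       # spans clusters 0 and 1, cost 2
    (("a", "c", "d"), 1.0),  # spans clusters 0 and 1, cost 2
]
example_clusters = {"a": 0, "b": 0, "c": 1, "d": 1}
example_cost = average_edge_cost(example_edges, example_clusters)
```

A clustering algorithm for this problem would search over assignments `cluster_of` (subject to a cluster-size limit, which this sketch does not model) to make this value small.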

    Codes for Load Balancing in TCAMs: Size Analysis

    Traffic splitting is a required functionality in networks, for example for load balancing over paths or servers, or by the source's access restrictions. The capacities of the servers (or the number of users with particular access restrictions) determine the sizes of the parts into which traffic should be split. A recent approach implements traffic splitting within the ternary content addressable memory (TCAM), which is often available in switches. It is important to reduce the amount of memory allocated for this task since TCAMs are power consuming and are often also required for other tasks such as classification and routing. Recent works suggested algorithms to compute a smallest implementation of a given partition in the longest prefix match (LPM) model. In this paper we analyze properties of such minimal representations and prove lower and upper bounds on their size. The upper bounds hold for general TCAMs, and we also prove an additional lower bound for general TCAMs. We also analyze the expected size of a representation for uniformly random ordered partitions. We show that the expected representation size of a random partition is at least half the size for the worst-case partition, and is linear in the number of parts and in the logarithm of the size of the address space.

    Efficient Measurement on Programmable Switches Using Probabilistic Recirculation

    Programmable network switches promise flexibility and high throughput, enabling applications such as load balancing and traffic engineering. Network measurement is a fundamental building block for such applications, including tasks such as the identification of heavy hitters (largest flows) or the detection of traffic changes. However, high-throughput packet processing architectures place certain limitations on the programming model, such as restricted branching, limited capability for memory access, and a limited number of processing stages. These limitations restrict the types of measurement algorithms that can run on programmable switches. In this paper, we focus on the RMT programmable high-throughput switch architecture, and carefully examine its constraints on designing measurement algorithms. We demonstrate our findings while solving the heavy hitter problem. We introduce PRECISION, an algorithm that uses Probabilistic Recirculation to find top flows on a programmable switch. By recirculating a small fraction of packets, PRECISION simplifies the access to stateful memory to conform with RMT limitations and achieves higher accuracy than previous heavy hitter detection algorithms that avoid recirculation. We also analyze the effect of each architectural constraint on the measurement accuracy and provide insights for measurement algorithm designers. Comment: To appear in IEEE ICNP 201

    Avoiding Flow Size Overestimation in the Count-Min Sketch with Bloom Filter Constructions

    The Count-Min sketch is the most popular data structure for flow size estimation, a basic measurement task required in many networks. Typically the number of potential flows is large, which rules out maintaining a counter per flow in memory with a high access rate. The Count-Min sketch is probabilistic and relies on mapping each flow to multiple counters through hashing. This implies potential estimation error such that the size of a flow is overestimated when all flow counters are shared with other flows with observed traffic. Although the error in the estimation can be probabilistically bounded, many applications can benefit from accurate flow size estimation and the guarantee to completely avoid overestimation. We describe a design of the Count-Min sketch with accurate estimations whenever the number of flows with observed traffic follows a known bound, regardless of the identity of these particular flows. We make use of a concept of Bloom filters that avoid false positives and indicate the limitations of existing Bloom filter designs towards accurate size estimation. We suggest new Bloom filter constructions that allow scalability with the support for a larger number of flows and explain how these can imply the unique guarantee of accurate flow size estimation in the well known Count-Min sketch. Ori Rottenstreich was partially supported by the German-Israeli Foundation for Scientific Research and Development (GIF), by the Gordon Fund for System Engineering as well as by the Technion Hiroshi Fujiwara Cyber Security Research Center and the Israel National Cyber Directorate. Pedro Reviriego would like to acknowledge the support of the ACHILLES project PID2019-104207RB-I00 and the Go2Edge network RED2018-102585-T funded by the Spanish Ministry of Science and Innovation and of the Madrid Community research project TAPIR-CM grant no. P2018/TCS-4496.
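The overestimation this abstract refers to can be seen in a minimal sketch of the standard Count-Min sketch. This is the textbook structure, not the paper's overestimation-free construction; the class name, the use of salted BLAKE2b as the per-row hash, and the parameter names are assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """Textbook Count-Min sketch: depth rows of width counters each."""

    def __init__(self, width: int, depth: int) -> None:
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row: int, key: str) -> int:
        # Assumption: one salted BLAKE2b hash per row stands in for the
        # independent hash functions of the analysis.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key: str, count: int = 1) -> None:
        # Every flow increments one counter in each row.
        for row in range(self.depth):
            self.rows[row][self._index(row, key)] += count

    def estimate(self, key: str) -> int:
        # The minimum over rows never underestimates, but it overestimates
        # when all of the flow's counters are shared with other flows.
        return min(self.rows[row][self._index(row, key)] for row in range(self.depth))


# With a single counter per row every flow collides, so the estimate for
# "flow-a" also absorbs the traffic of "flow-b".
tiny = CountMinSketch(width=1, depth=2)
tiny.add("flow-a", 3)
tiny.add("flow-b", 2)
overestimate = tiny.estimate("flow-a")
```

The degenerate `width=1` case forces the collision deterministically; with realistic widths the same effect occurs only when a flow is unlucky in every row.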

    Invertible Bloom Lookup Tables with Listing Guarantees

    The Invertible Bloom Lookup Table (IBLT) is a probabilistic concise data structure for set representation that supports a listing operation, namely the recovery of the elements in the represented set. Its applications can be found in network synchronization and traffic monitoring as well as in error-correction codes. IBLT can list its elements with probability affected by the size of the allocated memory and the size of the represented set, such that it can fail with small probability even for relatively small sets. While previous works only studied the failure probability of IBLT, this work initiates the worst case analysis of IBLT that guarantees successful listing for all sets of a certain size. The worst case study is important since the failure of IBLT imposes high overhead. We describe a novel approach that guarantees successful listing when the set satisfies a tunable upper bound on its size. To allow that, we develop multiple constructions that are based on various coding techniques such as stopping sets and the stopping redundancy of error-correcting codes, Steiner systems, and covering arrays as well as new methodologies we develop. We analyze the sizes of IBLTs with listing guarantees obtained by the various methods as well as their mapping memory consumption. Lastly, we study lower bounds on the achievable sizes of IBLT with listing guarantees and verify the results in the paper by simulations.
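The listing operation and its possible failure can be sketched with a simplified, insert-only IBLT. This is a standard hash-based table with a peeling decoder, not one of the paper's guaranteed constructions; the class and method names, the salted BLAKE2b hashing, and the restriction to non-negative 64-bit integer keys are assumptions. Variants that support deletions or set differences usually add a checksum field per cell, which is omitted here.

```python
import hashlib
from typing import List, Tuple


def _hash(seed: int, key: int) -> int:
    """Hypothetical seeded hash over non-negative 64-bit integer keys."""
    digest = hashlib.blake2b(
        key.to_bytes(8, "little"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


class SimpleIBLT:
    """Insert-only IBLT: each cell stores a count and the XOR of its keys."""

    def __init__(self, cells_per_hash: int, num_hashes: int = 3) -> None:
        self.m = cells_per_hash
        self.k = num_hashes
        self.count = [0] * (self.m * self.k)
        self.key_sum = [0] * (self.m * self.k)

    def _cells(self, key: int) -> List[int]:
        # One cell per sub-table, so a key never maps twice to the same cell.
        return [i * self.m + _hash(i, key) % self.m for i in range(self.k)]

    def _update(self, key: int, delta: int) -> None:
        for cell in self._cells(key):
            self.count[cell] += delta
            self.key_sum[cell] ^= key

    def insert(self, key: int) -> None:
        self._update(key, 1)

    def list_entries(self) -> Tuple[List[int], bool]:
        """Peel a copy of the table; return (recovered keys, success flag)."""
        work = SimpleIBLT(self.m, self.k)
        work.count = list(self.count)
        work.key_sum = list(self.key_sum)
        recovered: List[int] = []
        progress = True
        while progress:
            progress = False
            for cell in range(len(work.count)):
                if work.count[cell] == 1:  # pure cell: holds exactly one key
                    key = work.key_sum[cell]
                    recovered.append(key)
                    work._update(key, -1)  # remove the key from all its cells
                    progress = True
        return sorted(recovered), all(c == 0 for c in work.count)


# Listing fails when no cell is pure: here both keys share every cell.
stuck = SimpleIBLT(cells_per_hash=1, num_hashes=3)
stuck.insert(5)
stuck.insert(9)
stuck_result = stuck.list_entries()
```

In the `stuck` table every cell has count 2, so peeling never starts and listing reports failure. This is the kind of event that the worst-case guarantees described above are meant to rule out for sets below a chosen size.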

    Adaptive One Memory Access Bloom Filters

    Bloom filters are widely used to perform fast approximate membership checking in networking applications. The main limitation of Bloom filters is that they suffer from false positives that can only be reduced by using more memory. We suggest taking advantage of a common repetition in the identity of queried elements to adapt Bloom filters so that they avoid false positives for elements that repeat across queries. In this paper, one memory access Bloom filters are used to design an adaptation scheme that can effectively remove false positives while completing all queries in a single memory access. The proposed filters are well suited for scenarios in which the number of memory bits per element is low and thus complement existing adaptive cuckoo filters that are not efficient in that case. The evaluation results using packet traces show that the proposed adaptive Bloom filters can significantly reduce the false positive rate in networking applications with a single memory access. In particular, when using as few as four bits per element, false positive rates below 5% are achieved. This work was supported by the ACHILLES project PID2019-104207RB-I00 and the Go2Edge network RED2018-102585-T funded by the Spanish Agencia Estatal de Investigación (AEI) 10.13039/501100011033 and by the Madrid Community research project TAPIR-CM grant no. P2018/TCS-4496.
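The "one memory access" property can be sketched as a blocked Bloom filter in which all bits of an element live in a single memory word. The sketch below covers only that lookup structure and does not implement the paper's adaptation scheme; the class name, the 64-bit word size, and the BLAKE2b-based hashing are assumptions.

```python
import hashlib
from typing import Tuple


class OneAccessBloomFilter:
    """Bloom filter whose bits for any key all fall in one 64-bit word."""

    WORD_BITS = 64

    def __init__(self, num_words: int, bits_per_key: int = 3) -> None:
        if not 1 <= bits_per_key <= 8:
            raise ValueError("this sketch derives at most 8 bit positions per key")
        self.words = [0] * num_words
        self.k = bits_per_key

    def _word_and_mask(self, key: str) -> Tuple[int, int]:
        digest = hashlib.blake2b(key.encode(), digest_size=16).digest()
        # First 8 bytes select the word; each following byte selects one bit.
        word = int.from_bytes(digest[:8], "little") % len(self.words)
        mask = 0
        for i in range(self.k):
            mask |= 1 << (digest[8 + i] % self.WORD_BITS)
        return word, mask

    def add(self, key: str) -> None:
        word, mask = self._word_and_mask(key)
        self.words[word] |= mask

    def __contains__(self, key: str) -> bool:
        # A query reads exactly one word and checks that all its bits are set.
        word, mask = self._word_and_mask(key)
        return (self.words[word] & mask) == mask


flows = OneAccessBloomFilter(num_words=128)
flows.add("10.0.0.1:443")
seen = "10.0.0.1:443" in flows
```

Confining each element to one word trades a somewhat higher false positive rate for a single memory read per query, which is the setting in which the abstract's adaptation scheme operates.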